Germans Trias i Pujol Research Institute and Hospital (IGTP)
Badalona
March 27, 2025
1. Analysis of variance
2. One-Way Analysis of variance
3. Two-Way Analysis of variance
4. Repeated Mesures Analysis of variance
5. Final messages
\[ H_0: \mu_1 = \mu_2 \]
\[ H_1: \mu_1 \ne \mu_2 \]
| Characteristic | A N = 301 |
B N = 301 |
p-value2 |
|---|---|---|---|
| Weight (Kg) | 46.8 (11.7) | 52.8 (12.0) | 0.058 |
| 1 Mean (SD) | |||
| 2 Welch Two Sample t-test | |||
\[ H_0: \mu_1 = \mu_2 = \mu_3 \]
\[ H_1: \mu_1 \ne \mu_2 \ne \mu_3 \]
| Characteristic | A N = 301 |
B N = 301 |
C N = 301 |
|---|---|---|---|
| Weight (Kg) | 46.7 (8.5) | 52.2 (9.7) | 41.2 (8.7) |
| 1 Mean (SD) | |||
Analysis of Variance (ANOVA) is a statistical method used to compare the means of three or more groups to determine if there are significant differences between them.
Conditions:
Variance is a measure of dispersion that quantifies how much a set of values deviates from their mean.
\[ \sigma^2=\frac{\sum_{i=1}^{N}{(x_i-\mu)^2}}{N} \]
\[ S^2=\frac{\sum_{i=1}^{n}{(x_i-\bar{x})^2}}{(n-1)} \]
The standard deviation is the square root of the variance.
Analysis of variance (ANOVA) is based on the decomposition of total variability into two components:
Total Variance = Between-Group Variance + Within-Group Variance
Between-Group Variance: how much the group means deviate from the grand mean, weighted by the number of subjects in each group.
Within-Group Variance: how much individual observations deviate from their respective group means.
Decomposition of total variability is expressed as:
\[\sum_{j=1}^{g}\sum_{i=1}^{n_j}(x_{ij}-\bar{X})^2 = \sum_{j=1}^{g}n_j(\bar{x_j}-\bar{X})^2 + \sum_{j=1}^{g}\sum_{i=1}^{n_j}(x_{ij}-\bar{x_j})^2\] where
Show how each observation deviates from then grand mean.
\[\sum_{j=1}^{3}\sum_{i=1}^{30}(x_{ij}-\bar{X})^2\]
Show the difference between each group’s mean and the overall mean.
\[\sum_{j=1}^{3}n_j(\bar{x_j}-\bar{X})^2\]
Show how each observation deviates from its group’s mean.
\[\sum_{j=1}^{3}\sum_{i=1}^{30}(x_{ij}-\bar{x_j})^2\]
The p-value is obtained from the F-distribution with \((g-1, N-g)\) degrees of freedom.
\(H_0: \mu_1 = \mu_2 = \mu_3\)
\(H_1: \mu_1 \ne \mu_2 \ne \mu_3\)
The probability of obtaining a \(F=\frac{MS_b}{MS_w}\) as extreme as the observed (or more extreme) under the assumption that the null hypothesis is true is 0.1123655. We can’t reject the null hypothesis.
| ANOVA Table | |||||
|---|---|---|---|---|---|
| Terms | df | SS | MSQ | F | P-value |
| Between-groups | 2 | 512.5434 | 256.2717 | 2.241856 | 0.1123655 |
| Within-groups | 87 | 9945.1673 | 114.3123 | NA | NA |
\(H_0: \mu_1 = \mu_2 = \mu_3\)
\(H_1: \mu_1 \ne \mu_2 \ne \mu_3\)
The probability of obtaining a \(F=\frac{MS_b}{MS_w}\) as extreme as the observed (or more extreme) under the assumption that the null hypothesis is true is <0.0001. We can reject the null hypothesis.
| ANOVA Table | |||||
|---|---|---|---|---|---|
| Terms | df | SS | MSQ | F | P-value |
| Between-groups | 2 | 6005.04 | 3002.5200 | 22.14413 | 1.683683e-08 |
| Within-groups | 87 | 11796.32 | 135.5899 | NA | NA |
ANOVA only tells us that at least one group is different (without identifying which ones)
A post-hoc test is a statistical procedure used after an ANOVA to determine which specific groups differ when the overall test finds a significant effect.
Post-hoc tests control for multiple comparisons to avoid inflating the Type I error rate.
The dependent variable (response) should be approximately normally distributed in each group.
ANOVA is robust to this assumption when groups have similar sample sizes
Can be checked using QQ-plots or the Shapiro-Wilk test.
Shapiro-Wilk test
\(H_0: \text{Data is Normally distributed}\)
\(H_1: \text{Data is not Normally distributed}\)
It’s a highly sensitive test to sample size.
The variance of the response variable should be approximately equal across groups (Homoscedasticity).
Can be checked using different type of plots AND the Levene’s test or Bartlett’s test.
Levene’s test
\(H_0: \text{All groups have equal variances}\)
\(H_1: \text{At least one group has a variance different from the others}\)
Uses absolute deviations from the mean/median to assess data dispersion. Is robust to non-normality.
Bartlett test
\(H_0: \text{All groups have equal variances}\)
\(H_1: \text{At least one group has a variance different from the others}\)
Based on the likelihood-ratio test. It requires normality.
The Effect of Vitamin C dose on Tooth Growth in Guinea Pigs
library(dplyr) # Data managment
library(gtsummary) # Summary of results
library(ggplot2) # Graphics
data(ToothGrowth) # Data
ToothGrowth <- ToothGrowth |>
mutate(dose=factor(dose)) # dose as factor
ToothGrowth |>
dplyr::select(-supp) |>
gtsummary::tbl_summary(
by=dose,
type = all_continuous() ~ "continuous2",
statistic=all_continuous()~c("{mean} ({sd})",
"{median} ({p25}, {p75})"),
digits = all_continuous() ~ 2,
label = list(len ~ "Length (cm)"))
# Plot
ggplot(ToothGrowth, aes(x = dose, y =len,color=dose)) +
geom_point(size = 4)+
geom_boxplot()+
labs(x = "Dose", y = "Length (cm)") +
theme_minimal() +
theme(legend.position = "none")| Characteristic | 0.5 N = 20 |
1 N = 20 |
2 N = 20 |
|---|---|---|---|
| Length (cm) | |||
| Mean (SD) | 10.61 (4.50) | 19.74 (4.42) | 26.10 (3.77) |
| Median (Q1, Q3) | 9.85 (7.15, 13.00) | 19.25 (16.00, 23.45) | 25.95 (23.45, 28.35) |
#QQplot
ggplot(ToothGrowth, aes(sample = len)) +
geom_qq() +
geom_qq_line() +
labs(title = "Q-Q Plots for Each Group", x = "Theoretical Quantiles", y = "Sample Quantiles") +
facet_wrap(~dose) +
theme_minimal()
#Shapiro-Wilk
ToothGrowth |>
group_by(dose) |>
summarise(p_value = shapiro.test(len)$p.value)# A tibble: 3 × 2
dose p_value
<fct> <dbl>
1 0.5 0.247
2 1 0.164
3 2 0.902
# Plot
ggplot(ToothGrowth, aes(x =dose, y = len, fill = dose)) +
geom_violin(alpha = 0.5) +
geom_jitter(width = 0.2, size = 2, alpha = 0.7) +
theme_minimal() +
labs(title = "Violin+Jittered: Variance Visualization",
x = "Dose",
y = "Length (cm)") +
theme(legend.position = "none")
# Levene Test
library(car)
leveneTest(len ~ dose, data = ToothGrowth)
# Bartlett Test
bartlett.test(len ~ dose, data = ToothGrowth)Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group 2 0.6457 0.5281
57
Bartlett test of homogeneity of variances
data: len by dose
Bartlett's K-squared = 0.66547, df = 2, p-value = 0.717
# ANOVA: dependent ~ independent
model_anova <- aov(len ~ dose, data = ToothGrowth)
# Arrange results
results_anova <- broom::tidy(model_anova)
resultats_anova$term<-c("Between-groups","Within-groups")
names(resultats_anova)<-c("Terms","df","SS","MSQ","F","P-value")
# Tabulate results
results_anova |>
gt::gt() |>
gt::fmt_number(decimals = 3) |>
gt::tab_header(title = "Guinea Pigs: 1-way Anova") | Guinea Pigs: 1-way Anova | |||||
|---|---|---|---|---|---|
| Terms | df | SS | MSQ | F | P-value |
| Between-groups | 2.000 | 2,426.434 | 1,213.217 | 67.416 | 0.000 |
| Within-groups | 57.000 | 1,025.775 | 17.996 | NA | NA |
The probability of obtaining a \(F=\frac{MS_b}{MS_w}\) as extreme as the observed (or more extreme) under the assumption that the null hypothesis is true is <0.0001. We can reject the null hypothesis that all means are equal.
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = len ~ dose, data = ToothGrowth)
$dose
diff lwr upr p adj
1-0.5 9.130 5.901805 12.358195 0.00e+00
2-0.5 15.495 12.266805 18.723195 0.00e+00
2-1 6.365 3.136805 9.593195 4.25e-05
Pairwise comparisons using t tests with pooled SD
data: ToothGrowth$len and ToothGrowth$dose
0.5 1
1 2.0e-08 -
2 4.4e-16 4.3e-05
P value adjustment method: bonferroni
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Dunnett Contrasts
Fit: aov(formula = len ~ dose, data = ToothGrowth)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
1 - 0.5 == 0 9.130 1.341 6.806 1.34e-08 ***
2 - 0.5 == 0 15.495 1.341 11.551 < 1e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
dose emmean SE df lower.CL upper.CL
0.5 10.6 0.949 57 8.71 12.5
1 19.7 0.949 57 17.84 21.6
2 26.1 0.949 57 24.20 28.0
Confidence level used: 0.95
contrast estimate SE df t.ratio p.value
dose0.5 - dose1 -9.13 1.34 57 -6.806 <.0001
dose0.5 - dose2 -15.49 1.34 57 -11.551 <.0001
dose1 - dose2 -6.37 1.34 57 -4.745 <.0001
P value adjustment: tukey method for comparing a family of 3 estimates
# Plot
emmip(emm, ~ dose) +
theme_minimal() +
labs(title = "Estimated Marginal Means with CI",
y = "Estimated Mean Length",
x = "Dose")
# Plot estimated marginal means with confidence intervals
ggplot(as.data.frame(emm), aes(x = dose, y = emmean, ymin = lower.CL, ymax = upper.CL)) +
# Mean points
geom_point(size = 4, color = "blue") +
# Confidence intervals
geom_errorbar(width = 0.2, color = "blue") +
labs(title = "Marginal Means with 95% CIs",
x = "Dose",
y = "Estimated Tooth Length") +
theme_minimal() # Extract Standardized residuals and fitted values
ToothGrowth$resid <- rstandard(model_anova)
ToothGrowth$fitted <- fitted(model_anova)
# Normality
ggplot(ToothGrowth, aes(sample = resid)) +
geom_qq() +
geom_qq_line() +
labs(title = "Q-Q Plot of Residuals",
x = "Theoretical Quantiles",
y = "Sample Quantiles") +
theme_minimal()
# Residuals vs Fitted plot to check variance patterns
ggplot(ToothGrowth, aes(x = fitted, y = resid))+
geom_point() +
geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
labs(title = "Residuals vs Fitted Values",
x = "Fitted Values",
y = "Residuals") +
theme_minimal() Welch’s correction
Call:
t1way(formula = len ~ dose, data = ToothGrowth)
Test statistic: F = 60.026
Degrees of freedom 1: 2
Degrees of freedom 2: 21.53
p-value: 0
Explanatory measure of effect size: 0.88
Bootstrap CI: [0.79; 1]
White’s correction
Main effect of Factor A:
\(H_0: \mu_{A1} = \mu_{A2} = \mu_{A3}\)
\(H_1: \mu_{A1} \ne \mu_{A2} \ne \mu_{A3}\)
Main effect of Factor B:
\(H_0: \mu_{B1} = \mu_{B2} = \mu_{B3}\)
\(H_1: \mu_{B1} \ne \mu_{B2} \ne \mu_{B3}\)
Interaction effect (A × B):
\(H_0: \text{NO interaction between A and B}\)
\(H_1: \text{Interaction between A and B}\)
Two-way Analysis of Variance (ANOVA) is a statistical method used to compare the means across levels of two independent factors. It tests for significant differences in means due to each factor individually (main effects) and whether the two factors interact, meaning the effect of one factor depends on the level of the other (interaction effect).
Conditions:
Total Variance = Between-Group A Variance + Between-Group B Variance + Interaction-Group Variance + Within-Groups Variance
Between-Group A Variance: Measures how much the group A means deviate from the grand mean, weighted by the number of subjects in each group
Between-Group B Variance: Measures how much the group B means deviate from the grand mean, weighted by the number of subjects in each group
Interaction Variance: Measures how much the combined effect of Factors A and B deviates from what we would expect if the effects were purely additive.
\(SS_{AxB}=\sum_{i=1}^{a}\sum_{j=1}^{b}n_{ij}(\bar{x_{ij}}-\bar{x_{i}}-\bar{x_{j}}-\bar{X})\)
Within-Group Variance: Measures how much individual observations deviate from their respective group means.
The Effect of Vitamin C dose and Supplement type on Tooth Growth in Guinea Pigs
library(dplyr) # Data managment
library(gtsummary) # Summary of results
library(ggplot2)# Graphics
data(ToothGrowth) # data
ToothGrowth <- ToothGrowth |>
mutate(dose=factor(dose), # dose as factor
supp=factor(supp)) # supp as factor
ToothGrowth |>
tbl_strata(
strata = supp, .tbl_fun = ~ .x |>
tbl_summary(by = dose,
missing = "no",
type = all_continuous() ~ "continuous2",
statistic=all_continuous()~c("{mean} ({sd})",
"{median} ({p25}, {p75})"),
digits = all_continuous() ~ 2,
label = list(len ~ "Length (cm)"))|>
.header = "**{strata}**, N = {n}"
)| Characteristic |
OJ, N = 30
|
VC, N = 30
|
||||||
|---|---|---|---|---|---|---|---|---|
| N | 0.5 N = 10 |
1 N = 10 |
2 N = 10 |
N | 0.5 N = 10 |
1 N = 10 |
2 N = 10 |
|
| Length (cm) | 30 | 30 | ||||||
| Mean (SD) | 13.23 (4.46) | 22.70 (3.91) | 26.06 (2.66) | 7.98 (2.75) | 16.77 (2.52) | 26.14 (4.80) | ||
| Median (Q1, Q3) | 12.25 (9.70, 16.50) | 23.45 (20.00, 25.80) | 25.95 (24.50, 27.30) | 7.15 (5.80, 11.20) | 16.50 (15.20, 17.30) | 25.95 (23.30, 29.50) | ||
The Effect of Vitamin C dose and Supplement type on Tooth Growth in Guinea Pigs
# A tibble: 6 × 3
# Groups: dose [3]
dose supp p_value
<fct> <fct> <dbl>
1 0.5 OJ 0.182
2 0.5 VC 0.170
3 1 OJ 0.415
4 1 VC 0.270
5 2 OJ 0.815
6 2 VC 0.919
# Plot
ggplot(ToothGrowth, aes(x =dose, y = len, fill = dose)) +
geom_violin(alpha = 0.5) +
geom_jitter(width = 0.2, size = 2, alpha = 0.7) +
theme_minimal() +
facet_wrap(~supp) +
labs(title = "Violin+Jittered: Variance Visualization",
x = "Dose",
y = "Length (cm)") +
theme(legend.position = "none")
# Levene Test
library(car)
leveneTest(len ~ dose*supp, data = ToothGrowth)
# Bartlett Test
bartlett.test(len ~ interaction(dose, supp), data = ToothGrowth)Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group 5 1.7086 0.1484
54
Bartlett test of homogeneity of variances
data: len by interaction(dose, supp)
Bartlett's K-squared = 6.9273, df = 5, p-value = 0.2261
# ANOVA: dependent ~ independent
model_anova_2w <- aov(len ~ dose*supp, data = ToothGrowth)
# Arrange results
results_anova_2w <- broom::tidy(model_anova_2w)
results_anova_2w$term<-c("Between-groups: Dose","Between-groups: Supp","Interaction","Within-groups")
names(results_anova_2w)<-c("Terms","df","SS","MSQ","F","P-value")
# Tabulate results
results_anova_2w |>
gt::gt() |>
gt::fmt_number(decimals = 3) |>
gt::tab_header(title = "Guinea Pigs: 2-way Anova") | Guinea Pigs: 2-way Anova | |||||
|---|---|---|---|---|---|
| Terms | df | SS | MSQ | F | P-value |
| Between-groups: Dose | 2.000 | 2,426.434 | 1,213.217 | 92.000 | 0.000 |
| Between-groups: Supp | 1.000 | 205.350 | 205.350 | 15.572 | 0.000 |
| Interaction | 2.000 | 108.319 | 54.160 | 4.107 | 0.022 |
| Within-groups | 54.000 | 712.106 | 13.187 | NA | NA |
We reject the null hypothesis that all means are equal across dose groups.
We also reject the null hypothesis that all means are equal across supplement groups.
Furthermore, we reject the null hypothesis for the interaction effect, indicating that the effect of dose depends on the supplement type.
dose supp emmean SE df lower.CL upper.CL
0.5 OJ 13.23 1.15 54 10.93 15.5
1 OJ 22.70 1.15 54 20.40 25.0
2 OJ 26.06 1.15 54 23.76 28.4
0.5 VC 7.98 1.15 54 5.68 10.3
1 VC 16.77 1.15 54 14.47 19.1
2 VC 26.14 1.15 54 23.84 28.4
Confidence level used: 0.95
contrast estimate SE df t.ratio p.value
dose0.5 OJ - dose1 OJ -9.47 1.62 54 -5.831 <.0001
dose0.5 OJ - dose2 OJ -12.83 1.62 54 -7.900 <.0001
dose0.5 OJ - dose0.5 VC 5.25 1.62 54 3.233 0.0243
dose0.5 OJ - dose1 VC -3.54 1.62 54 -2.180 0.2640
dose0.5 OJ - dose2 VC -12.91 1.62 54 -7.949 <.0001
dose1 OJ - dose2 OJ -3.36 1.62 54 -2.069 0.3187
dose1 OJ - dose0.5 VC 14.72 1.62 54 9.064 <.0001
dose1 OJ - dose1 VC 5.93 1.62 54 3.651 0.0074
dose1 OJ - dose2 VC -3.44 1.62 54 -2.118 0.2936
dose2 OJ - dose0.5 VC 18.08 1.62 54 11.133 <.0001
dose2 OJ - dose1 VC 9.29 1.62 54 5.720 <.0001
dose2 OJ - dose2 VC -0.08 1.62 54 -0.049 1.0000
dose0.5 VC - dose1 VC -8.79 1.62 54 -5.413 <.0001
dose0.5 VC - dose2 VC -18.16 1.62 54 -11.182 <.0001
dose1 VC - dose2 VC -9.37 1.62 54 -5.770 <.0001
P value adjustment: tukey method for comparing a family of 6 estimates
# Plot estimated marginal means with confidence intervals
ggplot(as.data.frame(emm), aes(x = dose, y = emmean, color = supp, group = supp)) +
geom_point(size = 4) +
geom_line() +
geom_errorbar(aes(ymin = emmean - SE, ymax = emmean + SE), width = 0.2) +
scale_x_discrete(labels = c("0.5", "1", "2")) +
labs(x = "Dose", y = "Estimated Marginal Mean", color = "Supplement")
theme_minimal() # Extract residuals and fitted values
ToothGrowth$resid <- rstudent(model_anova_2w)
ToothGrowth$fitted <- fitted(model_anova_2w)
# Normality
ggplot(ToothGrowth, aes(sample = resid)) +
geom_qq() +
geom_qq_line() +
labs(title = "Q-Q Plot of Residuals",
x = "Theoretical Quantiles",
y = "Sample Quantiles") +
theme_minimal()
# Residuals vs Fitted plot to check variance patterns
ggplot(ToothGrowth, aes(x = fitted, y = resid)) +
geom_point() +
geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
labs(title = "Residuals vs Fitted Values",
x = "Fitted Values",
y = "Residuals") +
theme_minimal() In dependent samples, the same subjects are measured several times under different conditions.
Main effect of Time:
\(H_0: \mu_{T1} = \mu_{T2} = \mu_{T3}\)
\(H_1: \mu_{T1} \ne \mu_{T2} \ne \mu_{T3}\)
Main effect of Group:
\(H_0: \mu_{Control} = \mu_{Treatment}\)
\(H_1: \mu_{Control} \ne \mu_{Treatment}\)
Interaction effect (Time × Group):
\(H_0: \text{NO interaction between Time and Group}\)
\(H_1: \text{Interaction between Time and Group}\)
Repeated Measures Analysis of Variance (ANOVA) is a statistical method used to compare means across levels of an independent factor (e.g., study group) while accounting for within-subject dependencies over time.
Conditions:
Total Variance = Between-Groups Variance + Between-Time Variance + Between-Subjects Variance + Within-Subjects Variance
Between-Group Variance (Group): Measures how much the group means deviate from the grand mean. This is the effect of the group factor.
Between-Time Variance (Time): Measures how much the means at different time points deviate from the grand mean. This captures how much the dependent variable changes over time.
Between-Subjects Variance (Subjects): Measures how much the individual means deviate from the grand mean. This captures how each subject’s mean response differs from the grand mean.
Within-Subjects Variance (Error): This is the remaining variance within subjects that cannot be explained by time, group, or individual difference.
Mauchly’s Test for Sphericity
\(H_0: \text{Homogeneity of variances}\)
\(H_1: \text{No homogeneity of variances}\)
Sphericity is the condition where the variances of the differences between all combinations of related groups (levels) are equal.
The violation of sphericity is serious, causing an increase in the Type I error rate.
Data on weight loss and self esteem over three months, for three groups of individuals: Control, Diet and Diet + Exercise.
library(tidyr)
library(carData)
data("WeightLoss")
WeightLoss <- WeightLoss |>
filter(group!="DietEx") |> # Keep Control and Diet groups
mutate(group =factor(group), # Group as factor
subject = factor(row_number())) |> # Create identifier as factor
dplyr::select(-c(se1,se2,se3)) # Discard self esteem
WeightLoss_long <- WeightLoss |>
pivot_longer(
cols = starts_with("wl"), # Select weight columns
names_to = "Time", # New column for time points
values_to = "Weight" # New column for weight values
) |>
mutate(Time = factor(gsub("wl", "", Time))) # Time as factorWeightLoss_long |>
dplyr::select(-subject) |>
tbl_strata(
strata = group, .tbl_fun = ~ .x |>
tbl_summary(by = Time,
missing = "no",
type = Weight ~ "continuous2",
statistic=all_continuous()~c("{mean} ({sd})",
"{median} ({p25}, {p75})"),
digits = all_continuous() ~ 2,
label = list(Weight ~ "Diff in Kg")) )| Characteristic |
Control
|
Diet
|
||||
|---|---|---|---|---|---|---|
| 1 N = 12 |
2 N = 12 |
3 N = 12 |
1 N = 12 |
2 N = 12 |
3 N = 12 |
|
| Diff in Kg | ||||||
| Mean (SD) | 4.50 (1.00) | 3.33 (1.07) | 2.08 (1.16) | 5.33 (1.37) | 3.92 (1.38) | 2.25 (1.14) |
| Median (Q1, Q3) | 4.50 (4.00, 5.00) | 3.00 (2.50, 4.00) | 2.00 (1.00, 3.00) | 5.50 (4.00, 6.50) | 4.00 (3.00, 5.00) | 2.00 (1.00, 3.00) |
Error: subject
Df Sum Sq Mean Sq F value Pr(>F)
group 1 5.01 5.014 1.478 0.237
Residuals 22 74.64 3.393
Error: subject:Time
Df Sum Sq Mean Sq F value Pr(>F)
Time 2 90.86 45.43 98.86 <2e-16 ***
Residuals 46 21.14 0.46
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# ANOVA: dependent ~ independent
model_anova_rm <- aov(Weight ~ Time + group + Error(subject/Time), data = WeightLoss_long)
# Arrange results
results_anova_rm <- broom::tidy(model_anova_rm)
results_anova_rm$term<-c("Group","Residuals","Time","Residual")
names(results_anova_rm)<-c("Stratum","Term","df","SS","MSQ","F","P-value")
# Tabulate results
results_anova_rm |>
gt::gt() |>
gt::fmt_number(decimals = 3) |>
gt::tab_header(title = "Weigth: repeated measure Anova") | Weigth: repeated measure Anova | ||||||
|---|---|---|---|---|---|---|
| Stratum | Term | df | SS | MSQ | F | P-value |
| subject | Group | 1.000 | 5.014 | 5.014 | 1.478 | 0.237 |
| subject | Residuals | 22.000 | 74.639 | 3.393 | NA | NA |
| subject:Time | Time | 2.000 | 90.861 | 45.431 | 98.861 | 0.000 |
| subject:Time | Residual | 46.000 | 21.139 | 0.460 | NA | NA |
We can’t reject the null hypothesis that all means are equal across groups.
We reject the null hypothesis that all means are equal across Time.
Mauchly’s Test for Sphericity
library(ez)
anova_test <- ezANOVA(
data = WeightLoss_long,
dv = Weight, # Dependent variable
wid = subject, # Repeated measure (subject)
within = Time, # Within-subject factor
between = group, # Between-subject factor
detailed = TRUE
)
anova_test[2:3]$`Mauchly's Test for Sphericity`
Effect W p p<.05
3 Time 0.9061593 0.3553429
4 group:Time 0.9061593 0.3553429
$`Sphericity Corrections`
Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
3 Time 0.9142099 6.779951e-16 * 0.9928239 4.540108e-17 *
4 group:Time 0.9142099 2.327443e-01 0.9928239 2.313940e-01
GG: Greenhouse-Geisser epsilon; HFe: Huynh-Feldt epsilon.
group = Control:
contrast estimate SE df t.ratio p.value
Time1 - Time2 1.29 0.196 46 6.601 <.0001
Time1 - Time3 2.75 0.196 46 14.053 <.0001
Time2 - Time3 1.46 0.196 46 7.452 <.0001
group = Diet:
contrast estimate SE df t.ratio p.value
Time1 - Time2 1.29 0.196 46 6.601 <.0001
Time1 - Time3 2.75 0.196 46 14.053 <.0001
Time2 - Time3 1.46 0.196 46 7.452 <.0001
P value adjustment: tukey method for comparing a family of 3 estimates
library(afex) # Properly accounts for repeated measures
model_afex <- aov_car(Weight ~ Time + group + Error(subject/Time), data = WeightLoss_long)
emm_results <-emmeans(model_afex, ~ Time | group)
pairs(emm_results, adjust = "tukey")group = Control:
contrast estimate SE df t.ratio p.value
X1 - X2 1.17 0.251 22 4.655 0.0003
X1 - X3 2.42 0.313 22 7.726 <.0001
X2 - X3 1.25 0.253 22 4.938 0.0002
group = Diet:
contrast estimate SE df t.ratio p.value
X1 - X2 1.42 0.251 22 5.652 <.0001
X1 - X3 3.08 0.313 22 9.857 <.0001
X2 - X3 1.67 0.253 22 6.584 <.0001
P value adjustment: tukey method for comparing a family of 3 estimates
library(afex)
model_afex <- aov_car(Weight ~ Time + group + Error(subject/Time), data = WeightLoss_long)
emm_results <-emmeans(model_afex, ~ group | Time)
pairs(emm_results, adjust = "tukey")Time = X1:
contrast estimate SE df t.ratio p.value
Control - Diet -0.833 0.490 22 -1.701 0.1030
Time = X2:
contrast estimate SE df t.ratio p.value
Control - Diet -0.583 0.504 22 -1.156 0.2599
Time = X3:
contrast estimate SE df t.ratio p.value
Control - Diet -0.167 0.470 22 -0.355 0.7263
# Plot estimated marginal means with confidence intervals
as.data.frame(emm_results) |>
ggplot(aes(x = Time, y = emmean, color = group, group = group)) +
geom_point(size = 3) +
geom_line() +
geom_errorbar(aes(ymin = emmean - SE, ymax = emmean + SE), width = 0.2) +
labs(x = "Time", y = "Estimated Mean Weight") +
theme_minimal() # ANOVA: dependent ~ independent
model_anova_rm <- aov(Weight ~ Time * group + Error(subject/Time), data = WeightLoss_long)
# Arrange results
results_anova_rm <- broom::tidy(model_anova_rm)
results_anova_rm$term<-c("Group","Residuals","Time","Time:Group","Residual")
names(results_anova_rm)<-c("Stratum","Term","df","SS","MSQ","F","P-value")
# Tabulate results
results_anova_rm |>
gt::gt() |>
gt::fmt_number(decimals = 3) |>
gt::tab_header(title = "Weigth: repeated measure Anova") | Weigth: repeated measure Anova | ||||||
|---|---|---|---|---|---|---|
| Stratum | Term | df | SS | MSQ | F | P-value |
| subject | Group | 1.000 | 5.014 | 5.014 | 1.478 | 0.237 |
| subject | Residuals | 22.000 | 74.639 | 3.393 | NA | NA |
| subject:Time | Time | 2.000 | 90.861 | 45.431 | 101.070 | 0.000 |
| subject:Time | Time:Group | 2.000 | 1.361 | 0.681 | 1.514 | 0.231 |
| subject:Time | Residual | 44.000 | 19.778 | 0.449 | NA | NA |
We can’t reject the null hypothesis that all means are equal across groups.
We reject the null hypothesis that all means are equal across Time.
We can’t reject the null hypothesis about NO interaction between Time and Group.
One-way ANOVA: \[y \sim x_1 + \text{error} \]
Two-way ANOVA: \[y \sim x_1 + x_2 + x_1:x_2 +\text{error} \]
Two-way ANOVA repeated measures: \[y \sim x_1 + x_2 + \text{(1|Subject)} \]
ANOVA is a powerful tool for analyzing experimental designs.
The assumption of heterogeneity is key.
Normality? Not so much…
Post hoc analysis: handle with care.
Surprise: ANOVA is a linear model!
Applied Biostatistics Course with R